A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

Neural Information Processing Systems

In this paper we consider the problem of how a reinforcement learning agent that is tasked with solving a sequence of reinforcement learning problems (a sequence of Markov decision processes) can use knowledge acquired early in its lifetime to improve its ability to solve new problems. We argue that previous experience with similar problems can provide an agent with information about how it should explore when facing a new but related problem. We show that the search for an optimal exploration strategy can be formulated as a reinforcement learning problem itself, and demonstrate that such a strategy can leverage patterns found in the structure of related problems. We conclude with experiments that show the benefits of optimizing an exploration strategy using our proposed framework.
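The abstract's central idea, that the search for an exploration strategy can itself be posed as a reinforcement learning problem over a distribution of related tasks, can be illustrated with a minimal sketch. Here each task is a 3-armed bandit whose best arm is usually arm 0, and a meta-learned softmax policy replaces uniform-random exploration inside an epsilon-greedy learner. All names (`ExplorationPolicy`, `run_task`), the bandit setup, and the REINFORCE-style update are illustrative assumptions, not the paper's actual formulation.

```python
import math
import random

class ExplorationPolicy:
    """Softmax action preferences, updated across tasks (the meta level)."""
    def __init__(self, n_actions, lr=0.05):
        self.prefs = [0.0] * n_actions
        self.lr = lr

    def probs(self):
        exps = [math.exp(p) for p in self.prefs]
        z = sum(exps)
        return [e / z for e in exps]

    def sample(self, rng):
        r, acc = rng.random(), 0.0
        for a, p in enumerate(self.probs()):
            acc += p
            if r <= acc:
                return a
        return len(self.prefs) - 1

    def update(self, action, reward):
        # REINFORCE-style update: push preference toward rewarded actions
        p = self.probs()
        for a in range(len(self.prefs)):
            grad = (1.0 if a == action else 0.0) - p[a]
            self.prefs[a] += self.lr * reward * grad

def run_task(explore, rng, best_arm, steps=200, eps=0.3):
    """One inner RL problem: an epsilon-greedy bandit learner whose
    exploratory actions are drawn from the meta exploration policy."""
    q = [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for _ in range(steps):
        if rng.random() < eps:
            a = explore.sample(rng)        # meta-learned exploration
            explored = True
        else:
            a = q.index(max(q))            # greedy action
            explored = False
        reward = (1.0 if a == best_arm else 0.0) + rng.gauss(0, 0.1)
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]
        if explored:
            explore.update(a, reward)      # meta-level learning signal

rng = random.Random(0)
explore = ExplorationPolicy(n_actions=3)
for _ in range(100):                       # lifetime: a sequence of MDPs
    best = 0 if rng.random() < 0.8 else rng.randrange(1, 3)
    run_task(explore, rng, best)

# After many related tasks, exploration should favor the commonly
# optimal arm (arm 0) rather than sampling uniformly.
print(explore.probs())
```

The design choice this sketch highlights is the separation of levels: the inner learner solves each individual MDP, while the exploration policy is only ever updated from exploratory steps and persists across tasks, so it accumulates the cross-task structure the abstract refers to.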


Reviews: A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

Neural Information Processing Systems

I am on the fence for this paper. I like that the approach is simple. I like that it was tested in different domains. I like that several different views of performance are included. Nevertheless, I have many questions that need resolution before I can increase my score.


Reviews: A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

Neural Information Processing Systems

Even after the discussion and the author response there was still some disagreement between the reviewers. The paper proposes a simple yet novel and very interesting idea. There are still a few concerns about clarity, but those can be fixed in the final version (see updated reviews). Overall this is a solid paper that (as always) would benefit from more thorough empirical evaluation. One reviewer proposed adding an additional baseline: a domain-randomized robust policy trained on various tasks.



A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

Garcia, Francisco, Thomas, Philip S.

Neural Information Processing Systems
